Financial variables that impact BorrowerRates

Investigation Overview

This project investigates home loans and the factors that directly affect the borrower rates that a lending institution, in this case Prosper, offers to borrowers. The main variables of interest from the available dataset are:

Of the variables above, however, only a select few proved to influence borrower rates.

Dataset Overview

The loans dataset contains 113,937 loan listings from Prosper, spanning 8 years between 2009 and 2014. Of the 81 variables in total, the analysis keeps only the variables mentioned above. The ProsperRatings and CreditGrades variables cover different time periods and were merged, as was ProsperRating (numerical), to reduce NaN values, as seen in the next plot.
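The merge of the two rating variables can be sketched as follows. This is a minimal illustration on made-up rows, not the notebook's actual code: the column names `CreditGrade` and `ProsperRating (Alpha)` and the new `Rating` column are assumptions, and `fillna` is just one way to combine two columns that cover different periods.

```python
import numpy as np
import pandas as pd

# Toy rows (made up): older listings carry CreditGrade,
# newer listings carry a Prosper rating, one row has neither.
loans = pd.DataFrame({
    "CreditGrade": ["AA", "C", np.nan, np.nan, np.nan],
    "ProsperRating (Alpha)": [np.nan, np.nan, "B", "HR", np.nan],
})

# Use the Prosper rating where present, otherwise fall back to CreditGrade.
# "Rating" is a hypothetical name for the merged column.
loans["Rating"] = loans["ProsperRating (Alpha)"].fillna(loans["CreditGrade"])

# Count missing values before and after the merge.
missing_before = loans["ProsperRating (Alpha)"].isna().sum()
missing_after = loans["Rating"].isna().sum()
print(missing_before, missing_after)
```

Rows with no rating in either column stay missing, so the merge reduces NaN values rather than removing them.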

Several category values added little or no information to the analysis and were removed, for instance the employment types Employed and Other.
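Removing such categories might look like the sketch below. The rows are made up, and the column name `EmploymentStatus` and the variable names are assumptions used only for illustration.

```python
import pandas as pd

# Toy rows (made up) with two categories that carry little information.
loans = pd.DataFrame({
    "EmploymentStatus": ["Full-time", "Employed", "Other", "Part-time", "Retired"],
    "BorrowerRate": [0.09, 0.18, 0.22, 0.20, 0.17],
})

# Categories treated as uninformative for this analysis.
uninformative = ["Employed", "Other"]

# Keep only the rows whose employment status is not in that list.
keep = ~loans["EmploymentStatus"].isin(uninformative)
loans_clean = loans[keep].reset_index(drop=True)
print(loans_clean)
```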

Functions

Datatypes

Date time correction
Date Year and Month extraction
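
A minimal sketch of these two steps, assuming the dates arrive as strings in a column named `ListingCreationDate`. The column name, the sample values, and the new `ListingYear` and `ListingMonth` columns are illustrative assumptions.

```python
import pandas as pd

# Toy rows (made up): dates stored as plain strings.
loans = pd.DataFrame({
    "ListingCreationDate": ["2009-07-20 14:03:11", "2014-03-10 12:20:53"],
})

# Date time correction: convert the strings to a proper datetime column.
loans["ListingCreationDate"] = pd.to_datetime(loans["ListingCreationDate"])

# Year and month extraction through the .dt accessor.
loans["ListingYear"] = loans["ListingCreationDate"].dt.year
loans["ListingMonth"] = loans["ListingCreationDate"].dt.month
print(loans)
```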

Category re-definition

Column type definitions
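
One way to define a column type for the credit ratings is an ordered categorical, so that sorting and plotting follow the rating scale rather than alphabetical order. The sketch below assumes the scale HR, E, D, C, B, A, AA (worst to best); the variable names are hypothetical.

```python
import pandas as pd

# Assumed rating scale, ordered from highest risk to lowest risk.
rating_order = ["HR", "E", "D", "C", "B", "A", "AA"]
rating_type = pd.CategoricalDtype(categories=rating_order, ordered=True)

# Toy values (made up) converted to the ordered categorical type.
ratings = pd.Series(["A", "HR", "AA", "C"]).astype(rating_type)

# Sorting now follows the rating scale, not the alphabet.
print(ratings.sort_values().tolist())
```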

NaN/Missing Data

Distribution of Borrower Rate

The violin plot provides a holistic overview of the BorrowerRate distribution and marks the median and IQR.
The histograms support the violin plot by putting a numerical scale on the spread.
The logarithmic plot reveals values that are difficult to see in the regular plot.
From the three plots above, we can see that the distribution is multimodal, with three peaks.

Distribution of Employment Status

Employment status is a criterion that influences a borrower's borrowing power, and therefore the rate they are offered. Of the employment types, Full-time is the most likely to receive a low BorrowerRate: its distribution is widest (highest frequency) between 0.08 and 0.10, more so than for the other types. Full-time also has the lowest first quartile.

Distribution of Income Range

Income range has an impact on BorrowerRate: higher incomes have the lowest rates, as shown by their medians and interquartile ranges (IQR) and by the narrower tail from 0.20 onwards.
This makes sense, as a larger income gives these borrowers a better ability to repay the loan.
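
The medians and IQRs read off the plot can also be computed directly. The sketch below uses made-up rates for two income ranges. The column names `IncomeRange` and `BorrowerRate` and the labels `q1`, `median`, `q3`, and `iqr` are assumptions for illustration.

```python
import pandas as pd

# Toy rows (made up): four rates for each of two income ranges.
loans = pd.DataFrame({
    "IncomeRange": ["$25,000-49,999"] * 4 + ["$100,000+"] * 4,
    "BorrowerRate": [0.20, 0.24, 0.28, 0.32, 0.08, 0.12, 0.16, 0.20],
})

# Quartiles of BorrowerRate within each income range, one column per quantile.
quartiles = (
    loans.groupby("IncomeRange")["BorrowerRate"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
quartiles.columns = ["q1", "median", "q3"]

# Interquartile range: spread of the middle half of each group.
quartiles["iqr"] = quartiles["q3"] - quartiles["q1"]
print(quartiles)
```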

BorrowerRate by Credit Rating

Credit Rating is one of the main variables that contribute to a lower borrower rate on a loan.
The graph below clearly depicts the distribution for each rating: AA is concentrated in the lower borrower rate region, followed in turn by the subsequent ratings A through to HR (high risk).

Exploration of Occupations

After analysing the key variables that affect borrower rates, we turn to Occupation to see whether specific occupations receive preferential treatment or a particular set of rates.

The box plot above gives a clearer picture of the relationship between Occupation and BorrowerRate. Occupation does appear to affect borrower rates, as it is a subset of EmploymentStatus: Judge and Doctor appear to have the lowest median borrower rates, whereas Student College Freshman and Teacher's Aide appear to have the highest.
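
The ordering by median that the box plot suggests can also be checked numerically. This is a small sketch on made-up rates for three occupations. The column names and the variables `median_rate`, `lowest`, and `highest` are assumptions, and the numbers do not come from the dataset.

```python
import pandas as pd

# Toy rows (made up): two rates per occupation.
loans = pd.DataFrame({
    "Occupation": ["Judge", "Judge", "Doctor", "Doctor",
                   "Teacher's Aide", "Teacher's Aide"],
    "BorrowerRate": [0.10, 0.12, 0.11, 0.15, 0.24, 0.28],
})

# Median rate per occupation, sorted from lowest to highest.
median_rate = loans.groupby("Occupation")["BorrowerRate"].median().sort_values()

# Occupations at the two ends of the ranking.
lowest = median_rate.index[0]
highest = median_rate.index[-1]
print(median_rate)
```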

Because the list of occupations is long, only a select few professions are assessed, chosen to cover a wide proportion of the population. The selection is made visually, aiming to include one low and one high borrower rate from each field, e.g. business, law, engineering, public service, and admin.